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24 Abstract 

25 Mullerian mimicry among Neotropical Heliconiini butterflies is an excellent example of natural 

26 selection, and is associated with the diversification of a large continental-scale radiation. Some of 

27 the processes driving the evolution of mimicry rings are likely to generate incongruent phylogenetic 

28 signals across the assemblage, and thus pose a challenge for systematics. We use a dataset of 22 

29 mitochondrial and nuclear markers from 92% of species in the tribe to re-examine the phylogeny of 

30 Heliconiini with both supermatrix and multi-species coalescent approaches, characterise the 

3 1 patterns of conflicting signal and compare the performance of various methodological approaches to 

32 reflect the heterogeneity across the data. Despite the large extent of reticulate signal and strong 

33 conflict between markers, nearly identical topologies are consistently recovered by most of the 

34 analyses, although the supermatrix approach fails to reflect the underlying variation in the history of 

35 individual loci. The first comprehensive, time-calibrated phylogeny of this group is used to test the 

36 hypotheses of a diversification rate increase driven by the dramatic environmental changes in the 

37 Amazonia over the past 23 million years, or changes caused by diversity-dependent effects on the 

38 rate of diversification. We find that the tribe Heliconiini had doubled its rate of speciation around 

39 11 Ma and that the presently most speciose genus Heliconius started diversifying rapidly at 10 Ma, 

40 likely in response to the recent drastic changes in topography of the region. Our study provides 

41 comprehensive evidence for a rapid adaptive radiation among an important insect radiation in the 

42 most biodiverse region of the planet. 
43 

44 Keywords: multispecies coalescent, speciation rate, incongruence, mimicry, Lepidoptera, 

45 Amazonia, Miocene 
46 
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50 Visual mimicry provides an excellent system in which to study the origins of biodiversity, as 

5 1 the targets of selection are clearly identifiable and the role of natural selection in promoting 

52 adaptation and ultimately speciation can be directly observed (Bates 1863; Benson 1972; Mallet and 

53 Barton 1989; Sherratt 2008; Pfennig 2012). Studies of mimetic assemblages have been instrumental 

54 in explaining many biological phenomena, ranging from reproductive isolation (McMillan et al. 

55 1997; Jiggins et al. 2001; Merrill et al. 2012) to genetics of adaptation (Benitez-Vieyra et al. 2007; 

56 Baxter et al. 2009; Jones et al. 2012), and including prominent examples from vertebrates (Brodie 

57 III and Brodie Jr 2004, Wright 201 1), arthropods (Ceccarelli and Crozier 2007, Janzen et al. 2010, 

58 Hines and Williams 2012), plants (Benitez-Vieyra et al. 2007), and microbes (Elde and Malik 

59 2009). 

60 Testing hypotheses regarding the evolution of mimicry depends heavily on our knowledge 

61 of systematic relationships between the participating taxa, particularly where mimics are closely 

62 related. Phylogenetic comparative approaches have been used to address questions such as 

63 concerted coevolution of Miillerian co-mimics in catfish (Wright 201 1), advergence due to 

64 directional selection in Batesian mimicry of spiders (Ceccarelli and Crozier 2007) and selection on 

65 imperfect mimics among the hoverflies (Penney et al. 2012). Unfortunately, both hybridisation 

66 driven by strong selection on adaptive loci and rapid radiation leading to incomplete lineage sorting 

67 are likely to be common in mimetic systems (Savage and Mullen 2009; Kubatko and Meng 2010; 

68 Kunte et al. 201 1; Zhang et al. 2013) and can significantly interfere with the estimation of 

69 phylogenies (Maddison and Knowles 2006; Linnen and Farrell 2008; Edwards 2009; Anderson et 

70 al. 2012). 

71 We use the Neotropical butterfly tribe Heliconiini (Nymphalidae: Heliconiinae) to explore 

72 the problem of phylogeny estimation in a Miillerian mimetic group and investigate the link between 

73 the dynamics of speciation and macroevolutionary factors. Heliconiini are arguably the most 

74 thoroughly-researched example of microevolution in the Neotropics, the most biodiverse region of 

75 the world (Hoorn et al. 2010). They comprise the genus Heliconius and nine smaller genera, 
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76 providing a spectacular example of a radiation where speciation is frequently caused by mimicry of 

77 aposematic patterns directed against avian predators (Jiggins et al. 2001; Arias et al. 2008; Merrill 

78 et al. 2012). The system has provided an excellent example for the study of convergence from both 

79 genomic and organismal perspective (e. g. Duenez-Guzman et al. 2009; Hines et al. 201 1; Bybee et 

80 al. 2012; Heliconius Genome Consortium 2012; Martin et al. 2012, 2013; Pardo-Diaz et al. 2012; 

81 Jones et al. 2013; Nadeau et al. 2013; Supple et al. 2013). The clade presents an opportunity for 

82 comparative studies of many unique traits, including larval and imaginal sociality (Beltran et al. 

83 2007), pupal mating (Estrada et al. 201 1), pollen feeding (Cardoso and Gilbert 2013), or close 

84 association with varying numbers of Passifloraceae host plants (Brown 1981; Merrill et al. 2013). 

85 The ongoing proliferation of comparative molecular and genomic studies on the diversity of genes 

86 regulating vision (Pohl et al. 2009), chemosensation (Briscoe et al. 2013) and cyanogenesis 

87 (Chauhan et al. 2013) makes a stable molecular phylogeny especially desirable. 

88 An unusual feature of Heliconius is the prevalence and importance of gene flow and 

89 hybridisation, leading to a controversy over the validity of traditional species concepts in the clade 

90 (Beltran et al. 2002; Mallet et al. 2007). At least 26% of all species of Heliconiini occasionally 

91 produce hybrids in the wild (Mallet et al. 2007), and H. heurippa has resulted from homoploid 

92 hybrid speciation from parental forms diverged millions of years ago (Mavarez et al. 2006; Jiggins 

93 et al. 2008; Salazar et al. 2010). Genome sequencing has shown that mimetic diversity in the H. 

94 melpomene and silvaniform clades is also explained by adaptive introgression of genes regulating 

95 the aposematic wing patterns (Heliconius Genome Consortium 2012; Pardo-Diaz et al. 2012), and 

96 seemingly neutral gene flow is widespread, influencing as much as 40% of the genome between H. 

97 melpomene and H. cydno (Martin et al. 2013). 

98 A large body of recent work has been devoted to the issue of incongruence between the 

99 species tree (the true speciation history) and the gene trees evolving within (Anderson et al. 2012; 

100 Cutter 2013). The traditional approach of concatenating the total genetic evidence into a 

101 supermatrix in order to obtain a global estimate of the predominant phylogenetic signal and hidden 
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102 support (Gatesy and Baker 2005), without consideration for the heterogeneity of individual 

103 partitions, has been to some extent superseded by multispecies coalescent (MSC) techniques 

104 (reviewed in: Edwards 2009; Knowles and Kubatko 2010; Anderson et al. 2012; Cutter 2013; 

105 Leache et al. 2013). The majority of the new MSC algorithms are intended to model at least some of 

106 the sources of heterogeneity between different markers, most frequently focusing on the problem of 

107 incomplete lineage sorting (e.g. Maddison and Knowles 2006; Heled and Drummond 2010), 

108 sometimes addressing hybridisation (e.g. Gerard et al. 201 1; Yu et al. 201 1), and in at least one case 

109 modelling discordance without specifying its potential source (Larget et al. 2010). Although the 

1 10 supermatrix approach remains popular and serves as an effective approximation of the species 

1 1 1 diversification history in most cases, its ability to properly assess the degree of statistical support 

112 for phylogenies has been brought into question and contrasted with the potential of the MSC 

113 techniques to assess confidence more realistically (Edwards 2009; Knowles and Kubatko 2010). 

1 14 Heliconiini are an especially interesting subject for a systematic study, where the purported 

115 robustness of MSC tools to gene flow and other sources of incongruence can be tested with real 

116 data and used to evaluate support in the light of underlying heterogeneity of the phylogenetic signal. 

117 Heliconiini systematics has a long history, starting with early morphological work (Emsley 

118 1965; Brown 1981; Penz 1999), through allozymes (Turner et al. 1979) and a combination of 

119 morphological and RNA-restriction data (Lee et al. 1992), to studies based on the sequences of 

120 mitochondrial and nuclear markers (Brower 1994b; Brower and Egan 1997; Beltran et al. 2002, 

121 2007; Cuthill and Charleston 2012). The most comprehensive study to date by Beltran et al. (2007) 

122 attempted to address some of the difficulties by incorporating many taxa (38 of 46 Heliconius, 59 of 

123 78 Heliconiini), considering two individuals of most species, and sequencing two mitochondrial 

124 (CoI/II, 16S) and four nuclear markers {EFla, Wg, Ap, Dpp). However, this dataset is still 

125 potentially inadequate to address the challenges posed by Heliconiini systematics, as three of the 

126 loci (16S, Ap, Dpp) were only sequenced for 12 representative species. Among the other three 

127 markers, 65% of the variable sites resided in the fast-evolving mitochondrial partition Col/11, 
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128 raising the possibility that the inferred relations are largely driven by the historical signal of the 

129 matriline. The proposed relationships between the basal genera of the tribe were poorly supported 

130 and the relationships between Heliconius, Eueides and the other eight genera was not resolved, as 

131 might be expected if most of the data comes from a fast-evolving partition. Importantly, the 

132 individual data partitions were analysed in concatenation, as the debate on the species tree-gene tree 

133 incongruence was in its early stages at the time. Despite these shortcomings, Beltran and colleagues 

134 confirmed that the morphological and behavioural characteristics of the major clades do not 

135 correspond directly to their evolutionary history and that the traditionally recognised genera 

136 Laparus and Neruda are most likely nested within the crown genus Heliconius. 

137 The importance of Mullerian mimicry as a driver of individual speciation events has been 

138 well-established (Mallet et al. 1998; Mallet and Joron 1999; Jiggins 2008) whereas the 

139 macroevolutionary processes governing the evolution of the group have been largely neglected in 

140 empirical studies (but see Etienne et al. 2012; Rosser et al. 2012). Precise understanding of the 

141 evolution of the Mullerian mimicry rings as well as associated processes such as hybridisation, 

142 requires knowledge of the relative timing of the divergence events and motivated the most widely- 

143 cited study of the molecular clock in Arthropoda (Brower 1994a). Mallet et al. (2007) used a 

144 relaxed clock procedure to adjust the original branch lengths based on the CoI/II alignment. 

145 However, some of the relations were left unresolved, and the sequences were not partitioned into 

146 fast and slow-evolving codon positions, which can result in inflated age estimates (Brandley et al. 

147 2011). 

148 Most importantly, the first dated phylogeny of Heliconiini is not calibrated to an absolute 

149 standard, making it impossible to make inferences about the relation of the diversification process 

150 and the contemporaneous geological and climatic events. The cumulative results of over 200 

151 systematic studies demonstrate that most South American tropical clades have experienced periods 

152 of significantly elevated net diversification rate in response to Andean orogenesis, alterations in the 

153 hydrology and sediment dynamics of the present-day Amazon Basin, as well as local and global 
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154 climatic changes (Hoorn et al. 2010; Turchetto-Zolet et al. 2013). These processes can result in 

155 allopatry of incipient lineages, or in creation of new ecological niches and change to the species- 

156 level carrying capacity of the environment leading to ecological speciation (Brumfield and Edwards 

157 2007; Hoorn et al. 2010). A recent comparative study suggests that the majority of Heliconius 

158 lineages originated in the North-Eastern Andes and spread to other parts of the continent (Rosser et 

159 al. 2012). Three periods stand out as potentially critical in the history of Heliconiini as the times 

160 when orogenic, hydrologic and climatic events would have created new habitats and altitudinal 

161 gradients. The first phase of the intense Andean uplift, taking place 23 Ma, was also the start of the 

162 great marine incursion into the North of the continent, separating the incipient mountain range from 

163 the Eastern plateaus (Gregory- Wodzicki 2000; Solomon et al. 2008; Hoorn et al. 2010). 12.4 Ma 

164 marked the end of the Mid-Miocene Climatic Optimum and the onset of global cooling (Lewis et al. 

165 2007), which likely led to the reduction and separation of the rainforest areas by savannas 

166 unsuitable for many Heliconiini (Jaramillo et al. 201 1). Coincident was the second stage of rapid 

167 Andean orogeny 12 Ma, strongly changing the elevation gradients in the Central and Eastern sectors 

168 (Gregory- Wodzicki 2000), and followed shortly by the entrenchment of the Amazon in its modern 

169 course 10 Ma, and a gradual expansion of the rainforest in place of the retreating wetlands during 

170 the next 3 Myr (Hall and Harvey 2002, Figueiredo 2009, Hoorn 2010). Finally, the Eastern 

171 Colombian Cordillera uplifted in the last burst 4.5 Ma and the Isthmus of Panama connected at least 

172 3 Ma, possibly changing the patterns of isolation between various populations (Hill et al. 2013). 

173 Hence we can hypothesise that the diversification rate of Heliconiini increased during the periods of 

174 intensive Andean uplift around 23, 12 and 4.5 Ma. 
175 

176 Aims of the Study 

111 Here we aim to resolve the species tree of the Heliconiini radiation and generate a dataset 

178 including virtually all of the currently valid species in the tribe, sampling intraspecific diversity 

179 across the range of many of the species. We include information from 20 nuclear and two 
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180 mitochondrial markers, as well as the whole mitochondrial sequence of select species. The exonic 

181 markers comprise both autosomal and Z-linked loci, chosen to represent a sampling of 

182 chromosomes and a range of evolutionary rates, from fast-evolving mitochondrial genes (e.g. 

183 CoI/II) to more slow evolving nuclear loci (e.g. Wg). We apply a wide range of phylogenetic 

184 methods to reconstruct the species tree, including supermatrix, coalescent and network approaches, 

185 which allow us to assess the strength of the underlying signal of speciation. The power of our 

186 combined approach is harnessed to detect the incongruence between markers, likely to arise from 

187 the complex population-level processes acting in a radiation of Mullerian co-mimics. We elucidate 

188 the importance of marker heterogeneity for the final assessment of systematic relationships, while 

189 realistically estimating the support values for our chosen topology. A robust estimate of the 

190 relationships between 92% of the species and multiple outgroups makes it possible to date the time 

191 of individual divergence events with confidence, providing input for an analysis of diversification 

192 dynamics. The calibrated chronogram is used to test our hypothesis of diversification rate increase 

193 at times of dramatic environmental changes in the Andes and Amazonia, as well as the previous 

194 suggestions of diversity-dependent cladogenesis in Heliconiini (Fordyce 2010; Etienne et al. 2012). 

195 We thus present a comprehensive study of macroevolutionary dynamics in a mimetic system that 

196 has been studied intensively at the microevolutionary level. 
197 
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1 98 Materials And Methods 

199 

200 Taxon Sampling 

201 We sampled 180 individuals, including 71 of the 78 species in all genera of Heliconiini and 

202 1 1 outgroup species. The specimens came primarily from our collection at the University of 

203 Cambridge, with additional specimens shared by museums and private collectors (Online Appendix 

204 1). We included five outgroup species from the sister tribe Acraeini (Wahlberg et al. 2009) and 

205 three from the related genus Cethosia. The diverse analyses used in this paper require different 

206 sampling designs and the demands of all the techniques cannot be easily accommodated in a single 

207 dataset. For example, the network analysis based on nucleotide distance produced much better 

208 supported and resolved trees when the 95% or more incomplete data from historical specimens were 

209 not used, whereas the various multi-species coalescent techniques required the use of at least two 

210 individuals per species and had to be based only on taxa with intraspecific sampling (Fulton and 

21 1 Strobeck 2009; Heled and Drummond 2010). Thus we distinguish four datasets. The complete data 

212 matrix includes all the data. The core dataset excludes 14 individuals represented solely by short 

213 DNA fragments from historical specimens. The single-individual dataset includes both modern and 

214 historical specimens, but with only the single best-sequenced individual per taxon. Finally, the 

215 * BEAST matrix contains only the 17 species of Eueides and Heliconius with extensive sampling of 

216 multiple representatives of each species. 

217 

218 DNA Sequencing 

219 We used 20 nuclear and two mitochondrial loci as markers (Table 1; Online Appendix 1), 

220 exceeding the number of loci considered sufficient for most phylogenetic problems (Gatesy et al. 

221 2007). The selection includes the three classic molecular markers for Lepidoptera (CoI/II, EFla, 

222 Wg), two markers proposed by Beltran et al. (2007)(i 6S, Dpp), eight new universal markers 

223 proposed by Wahlberg and Wheat (2008) {ArgK, Cad, Cmdh, Ddc, Idh, Gapdh, Rps2, Rps5) and 



Downloaded from http://biorxiv.org/ on September 18, 2014 

224 nine highly variable loci identified by Salazar et al. (2010) (Aact, Cat, GlyRS, Hcl, Hsp40, Lm, 

225 Tada3, Trh, Vas). Additional Heliconius-specific primers were designed for Cmdh, Gapdh and Idh. 

226 Details of the primers and PCR cycles are listed in the Online Appendix 2. For most species, 

227 sequences of the three basic markers for multiple individuals were already published (Brower 1997; 

228 Beltran et al. 2007; Wahlberg et al. 2009; Salazar et al. 2010), and data for 26 individuals came 

229 exclusively from GenBank (Online Appendix 1). 

230 We generated Sanger sequences for 103 specimens (Online Appendix 1). DNA was isolated 

231 from approximately 50 ug of thorax tissue using the DNeasy Blood & Tissue kit (Qiagen, 

232 Manchester, UK). PCR was carried out in a total volume of 20 ul, containing lx Qiagen Taq buffer 

233 (Manchester, UK), 2.5 mM MgCl 2 , 0.5 uM of each primer, 0.2 mM dNTPs, 1 unit bovine serum 

234 albumin, 0.5 unit Qiagen Taq-Polymerase and 1 ul of the DNA extract. The following program was 

235 executed on a G-Storm cycler (Somerton, UK): denaturation 5 minutes at 94°C; 35 cycles of 30 

236 seconds at 94°C, 30 seconds at the annealing temperature and 90 seconds at 72°C; final extension 

237 for 10 minutes at 72°C. The results were visualised by electrophoresis in 1.5% agarose gel stained 

238 with 1% ethidium bromide. PCR products were cleaned using the ExoSAP-IT system (USB, 

239 Cleveland, Ohio): 60 minutes at 37°C; 20 minutes at 80°C. We used gel purification with the 

240 Nucleo Spin Extract II kit (Macherey-Nagel, Dueren, Germany) as needed. Sanger sequencing 

241 reaction was carried out with the BigDye Terminator v. 3.1 (AB, Foster City, California: 2 minutes 

242 at 94°C; 25 cycles of 10 seconds at 94°C, five seconds at 50°C and four minutes at 60°C. The 

243 products were sequenced with the ABI 3730x1 DNA Analyzer at the Sequencing Facility, 

244 Department of Biochemistry, University of Cambridge. We manually inspected the traces in 

245 CodonCode v. 4.0.4 using PHRED for quality assessment (CodonCode Corporation 2012). 

246 At the time of our Sanger sequencing effort, whole genome data generated for other studies 

247 became available from 57 individuals in 27 common species (Heliconius Genome Consortium 

248 2012; Briscoe et al. 2013; Supple et al. 2013; Dasmahapatra and Mallet, unpublished) (Online 

249 Appendix 1). 100 base pair reads were generated using the Illumina Genome Analyzer II and HiSeq 
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250 2000 platforms with insert size of 300-400 bp. We performed de novo assembly of the short reads 

251 in the program Abyss v. 1.3 (Simpson et al. 2009). Based on previous studies (Salzberg et al. 2012; 

252 Briscoe et al. 2013) and preliminary results (S. Baxter, pers. comm.), we chose k-mer length of 3 1, 

253 minimum number of pairs n=5 and minimum mean coverage c=2 as optimal settings. The 20 

254 nuclear markers were mined from the assemblies by megaBLAST (Camacho et al. 2009) in 

255 Geneious v. 5.5.1 (Biomatters Ltd 2012) using reference sequences from the Heliconius melpomene 

256 genome. The quality of the recovered sequences was assessed by alignment to previously generated 

257 amplicon sequences of the same loci from the same individuals. 

258 Mitochondrial sequences could not be recovered by de novo methods, presumably because the 

259 large number of reads from the highly abundant mitochondrial DNA contained a high enough 

260 number of erroneous sequences to interfere with the assembly. We reconstructed whole 

261 mitochondrial genomes of 27 species by mapping to the Heliconius melpomene reference (The 

262 Heliconius Genome Consortium, 2012), using the default settings in the Genomics Workbench v. 

263 5.5.1 (CLCBio 2012). This data was analysed separately from the 21 locus mixed nuclear- 

264 mitochondrial alignment. All the original sequences were deposited in GenBank (Accession 

265 numbers - data in submission; Online Appendix 1). 
266 

267 DNA Sequencing: Historical Specimens 

268 Short fragments of CoI/II and EFla were sequenced from historical specimens up to 150 

269 years of age, obtained from museum and private collections (Online Appendix 1) and processed in a 

270 vertebrate genetics laboratory to reduce the risk of contamination. Instruments and surfaces were 

271 cleaned with 5% bleach and irradiated with UV for 30 minutes prior to use. One to two legs were 

272 washed in water, immersed in liquid nitrogen in a test tube for 30 seconds and ground up, followed 

273 by an extraction into 20 (0,1 of buffer using the QIAmp DNA Micro Kit (Qiagen, Manchester, UK). 

274 We treated every fifth extraction and every fifth PCR as a negative control with no tissue or DNA 

275 extract. PCR reactions were carried out in a 20 (0,1 volume using 1 unit of Platinum HiFi Taq 
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276 Polymerase (Invitrogen, London, UK) and lx buffer, 2.5 mM MgCb, 0.5 uM of each primer, 0.2 

277 mM dNTPs, 1 unit bovine serum albumin, sterilised DNAse-free water and 1 -5 ul of the DNA 

278 extract depending on concentration. In order to accommodate shearing of DNA with time, we 

279 designed and applied PCR primers spanning short fragments of 200-300 bp (Online Appendix 2). 

280 We carried out amplification, product clean up and sequencing as above, partially accounting for 

281 possible cross-contamination by blasting the results against GenBank. 
282 

283 Alignment and Gene Tree Estimation 

284 Alignments for each locus were generated in CodonCode v. 3 to account for inverted and 

285 complemented sequences, and improved using MUSCLE v. 3.8 (Edgar 2004). We visualised the 

286 alignments of the coding loci (all except the mitochondrial 16S and the tRNA-Leu fragment in the 

287 CoI/II sequence) in Mesquite v. 2.75 (Maddison and Maddison 201 1) and checked translated 

288 sequences for stop codons indicating errors. The whole mitochondrial sequences were aligned to the 

289 Acraea issoria and Heliconius melpomene references (Heliconius Genome Consortium 2012) using 

290 the G-INS-i algorithm in MAFFT (Katoh 2002). The number of variable and parsimony informative 

291 sites was estimated for each locus in PAUP* v. 4 (Swofford 2002). Models of sequence evolution 

292 implemented in MrBayes (Ronquist and Huelsenbeck 2003) and BEAST (Drummond and Rambaut 

293 2007) were selected in MrModelTest v. 2.3 (Posada and Crandall 1998; Nylander 2004) based on 

294 the Akaike's AIC (Akaike 1974). Xia's test in DAMBE v. 4.0 (Xia and Xie 2001) demonstrated 

295 saturation in the third codon position of CoI/II. Protein-coding mitochondrial markers often display 

296 high saturation at the third codon position, potentially leading to incorrect parametrisation of 

297 substitution rate models (Brandley et al. 201 1). To account for this effect, in all the subsequent 

298 analyses we treated the third codon position of the fast-evolving CoI/II locus as a separate partition. 

299 The Leucine tRNA (tRNA-Leu) fragment occurring in the middle of CoI/II displays very low 

300 variability and thus was included in one partition with the slower evolving first and second codon 

301 positions. Individual gene trees were estimated in MrBayes v. 3.1, using four runs of one chain, 10 
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302 million MCMC cycles sampled every 1000 cycles, and 2.5 million cycles discarded as burnin based 

303 on the convergence diagnostic. The mitochondrial genes were concatenated due to their shared 

304 history, but treated as separate partitions with distinct models. All trees were visualised with 

305 FigTree v. 1.4 (Rambaut 2009). 

306 

307 Detection of Conflicting Signals 

308 We investigated the cyto-nuclear discordance and other conflicts in the phylogenetic signal 

309 with several methods. To illustrate the global reticulate signal in the data, a NeighborNet network 

310 was built in the program SplitsTree v. 4 (Kloepper and Huson 2008) with the pairwise distances 

311 calculated under the F84 correction for multiple hits. We reduced the dataset to a single best- 

312 sequenced individual per species in order to exclude the reticulations resulting from the expected 

313 recombination within species. Next, the topological disparity among individual loci was illustrated 

314 using Multi-Dimensional Scaling of pairwise Robinson-Foulds distance (Robinson and Foulds 

315 1981) between the gene trees, as estimated by TreeSetViz v. 1.0 (Hillis et al. 2005) in Mesquite. 

316 Calculation of the RF required trimming the trees to the minimal set of 54 shared taxa from 27 

317 species, using the R package APE (Paradis et al. 2004; R Development Core Team 2008). Finally, 

318 we investigated if topologies and branch lengths of the individual loci are consistent enough to be 

319 concatenated, by means of a hierarchical likelihood ratio test in Concaterpillar v. 1.5 (Leigh et al. 

320 2008). 
321 

322 Supermatrix Phylogenetics 

323 We created a supermatrix of the 20 nuclear and 2 mitochondrial markers in Mesquite and 

324 estimated the Maximum Likelihood (ML) phylogeny under the GTRGAMMA model in RAxML v. 

325 7.0.4, with 1000 bootstrap replicates under the GTRCAT approximation (Stamatakis 2006). To 

326 explicitly test the likelihood of various hypotheses for Heliconiini phylogeny, several alternative 

327 topologies were created in Mesquite, representing previously identified groupings, as well as 
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328 suggested placements of the enigmatic genera Cethosia, Laparus and Neruda (Beltran et al. 2007, 

329 Brower 1994, Brower & Egan 1997, Mallet et al. 2007, Penz 1999, Penz & Peggie 2003). We then 

330 re-estimated the ML tree under the GTRGAMMA model using each topology as a constraint. The 

331 likelihood scores of the original and alternative trees were compared using the Shimodaira- 

332 Hasegawa test (Shimodaira and Hasegawa 1989) and the Expected Likelihood Weights based on 

333 1000 bootstrap replicates (Strimmer and Rambaut 2002). A separate phylogeny was generated for 

334 the unpartitioned whole mitochondrial alignment in RAxML under the GTRGAMMA model with 

335 1000 bootstrap replicates. 

336 We estimated a calibrated Bayesian phylogeny using the program BEAST v. 1.7.5 

337 (Drummond et al. 2006). In order to avoid incorrect estimates of the substitution rate parameters 

338 resulting from the inclusion of multiple samples per species, this analysis was based on a pruned 

339 alignment with one individual per species. The only exception is the inclusion of 3 races of H. 

340 melpomene and 2 races of H. erato, where deep geographical divergences are found (Quek et al. 

341 2010). The Vagrantini sequences were not used, as BEAST can estimate the placement of the root 

342 without an outgroup (Drummond et al. 2006). Thus the analysis included 78 taxa and the 

343 substitution rate models were re-estimated appropriately (Table 1). We linked the topology, but 

344 modelled an uncorrelated lognormal clock and the substitution rate separately for each locus 

345 (Drummond et al. 2006). Substitution rates were drawn from the overdispersed gamma distribution 

346 prior with shape parameter k=0.001, scale parameter theta=l and starting value 0.001 for nuclear 

347 genes, and k=0.01 for the faster-evolving mitochondrial loci (van Velzen 2013). We used a Birth- 

348 Death tree prior and empirical base frequencies to limit the computation time for the heavily 

349 parametrised model. As no fossils of Heliconiini or closely related tribes are known, we used a 

350 secondary calibration point from the dated phylogeny of Nymphalidae (Wahlberg et al. 2009; van 

351 Velzen et al. 2013). The age of the root was modelled as normally distributed with a mean of 47 Ma 

352 and a standard deviation of 3.0 Ma, corresponding to the 95% confidence intervals of the Acraeini- 

353 Heliconiini split found by Wahlberg et al. (2009). Four independent instances of the MCMC chain 
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354 were ran for 100 million cycles each, sampling the posterior every 10000 cycles and discarding 10 

355 million cycles as burnin. The input .xml file generated in BEAUti v. 1.7.5 (Drummond et al. 2010) 

356 can be found in the Online Appendix 3. To ensure that the results are driven by the data and not the 

357 priors, we executed a single empty prior run. Convergence of the continuous parameters was 

358 evaluated in Tracer v. 1 .4, and the Maximum Clade Credibility tree with mean age of the nodes was 

359 generated using LogCombiner v. 1.7.5 and TreeAnalyser v. 1.7.5 (Rambaut and Drummond 2010). 
360 

361 Multispecies Coalescent Phylogenetics 

362 To account for the heterogeneous phylogenetic signal resulting from gene flow, hybridisation 

363 and incomplete lineage sorting, we applied a variety of multispecies coalescent (MSC) analyses that 

364 take as input both the raw alignment and individual gene trees). We first used the established 

365 method of minimising deep coalescences (MDC) (Maddison and Knowles 2006), taking advantage 

366 of a dynamic programming implementation in the package PhyloNet (Than et al. 2008). 100 

367 bootstrap input files containing 100 trees were drawn randomly without replacement from the 

368 distribution of Bayesian gene trees for the 21 loci, an MDC phylogeny was estimated for each input 

369 and a 50% majority rule consensus was taken. 

370 Bayesian Concordance Analysis (BCA) is an MSC method that attempts to reconcile the 

371 genealogies of individual loci based on posterior distributions, regardless of the sources of conflict 

372 (Larget et al. 2010). BCA generates Concordance Factors (CFs), which show what proportion of 

373 loci contain a particular clade, and estimates the primary phylogenetic hypothesis from the best- 

374 supported clades. CFs offer a powerful alternative to traditional measures of support and can be 

375 conveniently estimated in the program BUCKy (Ane et al. 2007; Larget et al. 2010). We executed 

376 two runs of one million MCMC cycles in BUCKy based on the 21 posterior distributions of gene 

377 trees from MrBayes. 

378 Another approach to the multispecies coalescent is to estimate the gene trees and the species 

379 tree simultaneously, explicitly modelling the sources of incongruence (Edwards et al. 2007). We 
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380 applied this technique using *BEAST, a program harnessing the power of BEAST and 

381 simultaneously implementing a powerful MSC algorithm that estimates the species tree and the 

382 embedded gene trees, as well as the population sizes of the lineages (Heled & Drummond 2010). 

383 Effective calculation of the population size parameters requires a thorough multilocus sampling of 

384 each species in the analysis, which forced us to reduce the dataset to species with a minimum of 

385 three individuals from at least two distinct populations (the *BEAST dataset ). The final alignment 

386 included 87 terminal taxa in 17 species of Eueides and Heliconius. We re-estimated the individual 

387 substitution models for each partition, used a constant population size coalescent tree model and 

388 implemented other priors as described above for BEAST. We carried out four independent runs of 

389 500 million cycles each, sampling every 10000 cycles, generated a maximum clade credibility 

390 species tree and visually summarised the 21 gene trees by plotting in DensiTree (Bouckaert 2010). 

391 Online Appendix 4 contains the .xml file for this analysis. 

392 Sequence alignments and phylogenetic trees were deposited in TreeBase 

393 (http://purl.org/phylo/treebase/phylows/study/TB2 : S 1 5 5 3 1 ) . 
394 

395 Changes in the Diversification Rate 

396 The formal analyses of diversification dynamics were based on the output of the Bayesian 

397 supermatrix analysis in BEAST and conducted in R. Errors in divergence time estimates are not 

398 expected to affect the divergence rate calculations significantly (Wertheim and Sanderson 201 1), 

399 and the semi-log lineage through time plots (LTT) of 200 randomly selected posterior trees formed 

400 a narrow distribution around the maximum clade credibility (MCC) tree with mean node ages. 

401 Hence we based all the analyses on the MCC tree of Heliconiini with one individual per species, 

402 excluding outgroups and Cethosia. 

403 We searched for the signal of change in diversification rate using TurboMEDUSA, which 

404 identifies shifts a posteriori through stepwise AIC (Alfaro et al. 2009). We wanted to investigate 

405 whether the Heliconiini have undergone significantly faster diversification during the periods of 
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406 dramatic environmental change at either 12 or 4.5 Ma, whether the diversification rate of 

407 Heliconiini decreased with time due to limiting species density effects (Fordyce 2010; Etienne et al. 

408 2012b), or increased as new lineages evolved novel colour patterns and drove positive mimetic 

409 interactions that reduce competition (Elias et al. 2009). To compare these possibilities explicitly, we 

410 fitted maximum likelihood models of increasing complexity to the MCC tree, accounting for the 

41 1 uncertainty in the estimates of recent speciation (protracted speciation) and extinction rates (pull of 

412 the present) by truncating the phylogeny to 1 Ma before present (Etienne and Haegeman 2012). The 

413 following models were fitted and compared using the Akaike weights for all Heliconiini and for 

414 Heliconius alone: pure birth, with a constant rate or with a shift in rate (PB); birth-death with and 

415 without changes in rate (BD); diversity dependent model without extinction (DDL); diversity 

416 dependent model with linear terms for speciation or both speciation and extinction (DDE); DDE 

417 with rate shifts. We included rate shifts at an unspecified time, at 12 and 4.5 Ma, as well as at 1 1 

418 and 3.5 Ma to account for possible delayed effects. The likelihood was conditioned on the survival 

419 of phylogeny in all runs and it was assumed that there are six species missing for Heliconiini and 

420 two for Heliconius (Lamas 2004). A possible slowdown in diversification rate after an initial rapid 

421 radiation was evaluated by calculating the gamma statistic (Pybus and Harvey 2000) and its 

422 significance tested in a Monte Carlo Constant Rates (MCCR) test with 1000 replicates in the 

423 package LASER (Rabosky 2006). 

424 
425 
426 
427 
428 
429 
430 
431 
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432 Results 

433 

434 Sampling and DNA Sequencing 

435 We successfully combined three approaches to sequencing, which resulted in the high 

436 taxonomic coverage and intraspecific sampling necessary for the MSC methods, and a sufficient 

437 sampling of loci for each individual (Online Appendix 1). Most of the dataset consists of Sanger 

438 sequences from 108 individuals, 26 of which were already published in GenBank. We also obtained 

439 two classic lepidopteran markers CoI/II and EFla from respectively 13 and 1 1 out of 14 historical 

440 specimens, thus adding eight species of Heliconiini that have not been sequenced previously. We 

441 did not manage to obtain any data from Philaethria andrei, P. browni, P. romeroi, Eueides emsleyi, 

442 E. libitina, Heliconius lalitae or H. metis (Lamas 2004; Constantino and Salazar 2010; Moreira and 

443 Mielke2010). 

444 We capitalised on the availability of Illumina data by generating de novo assembly contigs for 

445 further 57 individuals. The N50 of the Abyss assemblies ranged from 552 to 1921 bp (average 

446 1206) and all the nuclear markers were successfully recovered by megaBLAST from every 

447 assembly. Whole mitochondrial sequences of the same individuals were recovered by read 

448 mapping, with about a 400 bp stretch of the hypervariable control region (Heliconius Genome 

449 Consortium 2012) incomplete in some sequences. We obtained a depth of coverage over lOOx and 

450 high confidence in the base calls due to the high copy number of mtDNA in the tissue. Finally, we 

451 included the sequences extracted from the Heliconius melpomene genome, as well as the previously 

452 published mitochondrial sequence of Acraea issoria (Heliconius Genome Consortium 2012). 

453 The sequence data for 20 nuclear and 2 mitochondrial genes encompass 71 out of 78 (91%) 

454 of the officially recognised species of Heliconiini, including 44 out of 46 species from the focal 

455 genus Heliconius (Lamas et al. 2004; Beltran et al. 2007; Mallet et al. 2007; Constantino and 

456 Salazar 2010). Although the taxonomic validity of some species is contested, we found that the 

457 diversification analysis is robust to altering the number of missing species. Some recognised taxa 
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458 diverged very recently, as shown by the BEAST chronogram (Fig. 1) in case of the Philaethria 

459 diatonicalP. neildilP. ostara complex and the Heliconius heurippalH. tristero pair, the latter of 

460 which could be considered races of//, timareta (Nadeau et al. 2013). However, the exact 

461 relationship between genetic differentiation and taxonomic species identity in the highly variable, 

462 mimetic Heliconiini remains unclear (e.g. Merot et al. 2013). Importantly, 36 species are 

463 represented by multiple individuals, usually from distant populations, allowing for more accurate 

464 estimation of sequence evolution rates, and detection of species paraphyly. The number of 

465 individuals represented by each marker ranges from 40% for Hcl to 98% for Col/11, and only four 

466 specimens are represented exclusively by mitochondrial DNA (Online Appendix 1). 
467 

468 Pervasive Conflict Between the Loci 

469 Our nuclear markers span 1 1 out of 21 chromosomes (Table 1; Heliconius Genome 

470 Consortium, 2012) and have both autosomal and sex chromosome Z-linked inheritance. We 

471 examined the conflict between individual markers in the entire tribe and in the genus Heliconius 
All alone, using both gene tree summary methods and approaches utilising the raw sequence 

473 alignments. The maximum likelihood analysis of the core matrix in Concaterpillar rejected 

474 concatenation of any of the loci due to significant differences in both topology and substitution rate 

475 of individual partitions, but the exact nature of the discordance is unclear. A Multi-Dimensional 

476 Scaling ordination of pairwise RF distances between the gene trees does not reveal clustering by 

477 chromosome and the separation between many nuclear loci appears much greater than between 

478 nuclear and mitochondrial trees (Fig. 2). Consistent with this is the fact that the whole 

479 mitochondrial phylogeny of select taxa (Fig. 3) shows few differences from the tree based on the 

480 mixed marker supermatrix (Fig. 1), highlighting that cyto-nuclear discordance is not the primary 

481 source of incongruence in the dataset. 

482 The coalescent approaches reveal the high extent of marker conflict in the Heliconius data. 

483 Gene tree topologies from the explicit Bayesian modelling of incomplete lineage sorting (ILS) in 
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484 *BEAST are highly varied, with a particularly high degree of reticulation in the H. 

485 melpomenel cydno and the H. hecale (silvaniform) clades, where extensive horizontal gene flow has 

486 been observed previously (Sup. Fig. SI; Brown 1981; Martin et al. 2013). Another Bayesian 

487 method, BUCKy, infers the species in the presence of marker incongruence without modelling 

488 specific reasons for the observed discordance and calculates the Bayesian concordance factors that 

489 illustrate the proportion of partitions in the dataset that support a particular grouping (Ane et al. 

490 2007; Baum 2007; Larget et al. 2010). The concordance of the loci for Heliconius is strikingly low 

491 (Fig. 4), although the topology is consistent with the results of other analyses (Fig. 1 and 3, Sup. 

492 Fig. S2-S6). 

493 Further strong evidence of widespread incongruence comes from the NeighborNet network 

494 characterised by a high delta score of 0.276, which shows that the structure of the data is not 

495 entirely tree-like (Sup. Fig. S3). This can be partially attributed to the effect of missing data, yet 

496 even a fit based on the 30 species with little missing data produced a delta score of 0. 1 1, proving a 

497 substantial amount of non-bifurcating signal across the tribe (Holland et al. 2002). The most 

498 noticeable reticulations are found between nodes linking genera and the major clades of Heliconius, 

499 possibly due to pervasive gene flow during the diversification of the main extant lineages (Sup. Fig. 

500 S3). 
501 

502 Topological Consistency Across Optimality Criteria 

503 Although no trees are identical, the results from our Bayesian, Maximum Likelihood and 

504 distance-based network analyses of the supermatrix are very similar (Fig. 1, Sup. Fig. S3, S4). Two 

505 nodes stand out as unstable. Cethosia is variably placed as a sister taxon to either Acraeini (MP, 

506 ML, NeighborNet) or Heliconiini (Bayesian), despite the reasonably extensive sampling of 1 1 loci 

507 for C cyane, while the position of Podotricha in relation to Dryas and Dryadula varies between all 

508 analyses. Most problematic are relations among the species within Eueides, where the position of 

509 four out of 10 taxa cannot be resolved with good support. The poor resolution for both Eueides and 
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510 Podotricha can probably be attributed to insufficient site coverage, which produces high uncertainty 

511 due to patchily distributed missing data (Wiens and Morrill 201 1; Roure et al. 2013). We re- 

512 estimated the relations of Eueides based solely on a core set of 1 1 genes with coverage for at least 

513 7/10 species and recovered a much better supported tree (Sup.. Fig. S6). We recommend studies 

514 focusing on Eueides use this specific phylogeny, but the exact relations of E. procula, E. lineata 

5 1 5 and E. heliconioides remain unclear. 

516 Although our Bayesian maximum clade credibility tree is largely consistent with the 

517 topology estimated by Beltran et al. (2007), significantly increasing the dataset from 113 to 180 

518 individuals and five to 21 loci allowed us to resolve many critical nodes. The major differences are 

519 in the relations of the basal genera, which we infer to form a well-supported grade, with Eueides as 

520 the definitive sister genus of Heliconius (Fig. 1). Importantly, we confirm that the enigmatic genera 

521 Laparus and Neruda are nested within Heliconius, as further supported by ELW and SH tests 

522 (Table 2), although Neruda is closer to the base of the genus than previously estimated. We find 

523 that the other so called "primitive" (Brown 1981) clade of Heliconius consists of two separate 

524 groups, which are by no means basal to the other taxa, raising questions about the apparently 

525 unequal rate of morphological evolution in the genus. 

526 The maximum likelihood tree is similar to the Bayesian phylogeny, although characterised 

527 by lower overall support, especially for the nodes linking the genera and subgenera (Sup. Fig. S4). 

528 The nodes differing between the two trees are also the nodes that cannot be unequivocally 

529 confirmed by either the SH and the ELW test (Table 2). For instance, the highest posterior 

530 probability in the ELW test (0.21) is given to a tree placing H. heurippa as a sister species to H. 

531 timareta, a result recovered in both the Bayesian analysis and a recent genome-wide study (Nadeau 

532 et al. 2013). Thus we suggest that the Bayesian tree should be preferred as a more accurate picture 

533 of the phylogenetic relationships, although the ML tree based on the complete dataset is still useful 

534 to uncover multiple polyphyletic species. Notably, H luciana is nested within H elevatus, and//. 

535 wallacei is polyphyletic with respect to H. astraea. These results must be interpreted with caution, 
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536 as the inference relies on poorly covered museum specimens and may be sensitive to long branch 

537 attraction ( Wiens and Morrill 2011). 

538 Our whole mitochondrial phylogeny is largely consistent with the results of the multilocus 

539 supermatrix analysis and well-supported for 46/57 nodes (Fig. 3), despite the relatively limited 

540 taxonomic coverage of only 29 Heliconiini for which short-read data was available. In contrast to 

541 the multilocus dataset, mitochondrial genomes are not very useful for resolving relationships 

542 between major clades within Heliconius, as the position of Neruda, xanthocles and wallacei clades 

543 is poorly supported. An important deviation from the predominantly nuclear multilocus phylogeny 

544 is the placement of H. pachinus as a sister taxon of H. timareta, rather than H. cydno. This may 

545 reflect the overall instability resulting from a high extent of reticulation in the 

546 melpomenelcydnol timareta assemblage (Heliconius Genome Consortium 2012, Martin et al. 2013, 

547 Merot et al. 2013). Furthermore, we find a surprising positioning of H. hermathena within H. erato 

548 (Jiggins et al. 2008), which is also not supported by any of the analyses of the 21 locus matrix. The 

549 mitochondrial tree confirms previous observations of deep biogeographical splits in the widespread, 

550 highly diversified//, erato and its co-mimic H. melpomene (Brower 1994a). Within//, melpomene 

551 we find a well-supported distinction between races found to the West and East of the Andes, 

552 although our data place the individuals from French Guiana with the specimens from the Western 

553 clade, in contrast to a whole-genome phylogeny (Nadeau et al. 2013) (Fig. 1). H. erato shows the 

554 opposite pattern in both mitochondrial and nuclear data, whereby the Guianian races form a fully 

555 supported clade in the monophyletic group of taxa from East of the Andes. 

556 The variety of multi-species coalescent (MSC) methods that we applied brings a new 

557 perspective to the phylogenetic signals in the Heliconiini. Notably, different approaches and 

558 combinations of the data yield highly consistent topologies, although the support for the individual 

559 nodes varies. We used three techniques that attempt to deal with various aspects of the observed 

560 incongruence between loci. The maximum parsimony approach of MDC is a summary method 

561 deriving a species tree from the distribution of gene trees with single or multiple terminals per 
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562 species, intended to address the effects of ILS (Maddison 1997; Maddison and Knowles 2006). 

563 Although infrequently considered in recent studies, MDC has been implemented in multiple 

564 packages (Maddison and Maddison 201 1; Than et al. 2008) and offers superior speed of analysis 

565 when compared to other MSC techniques. In case of our complete dataset, a run with 100 bootstrap 

566 datasets took a few minutes. The resulting topology is nonetheless very different from the other 

567 results, showing a number of unexpected and poorly supported groupings. The lumping of all non- 
568 Heliconius genera, and the monophyletic Nerudalxanthocleslwallacei clade stand out in contrast to 

569 other proposed trees (Sup. Fig. S6). Interestingly, many of the relations that are poorly supported in 

570 the supermatrix phylogenies are also not resolved in the consensus MDC tree, showing that MDC is 

571 highly conservative with regard to the placement of taxa unstable in individual gene trees. 

572 The Bayesian MSC approach implemented in *BEAST uses MCMC to co-estimate the 

573 species-level tree and the contained gene trees by estimating demographic parameters together with 

574 the phylogeny, but only with an appropriate intraspecific sampling (Heled and Drummond 2010). 

575 Although computationally intensive, *BEAST models ILS and proposes a species tree without 

576 ignoring the underlying heterogeneity in specific locus genealogies, performing exceptionally well 

577 even with the most difficult cases combining large population size with short divergence times 

578 (Leache & Rannala 2010). We analysed the small dataset of the 17 best-sampled species and 

579 recovered a tree which agrees with the supermatrix analyses (Sup. Fig. S2), except for the position 

580 of Neruda aoede, which was placed as a sister taxon to the H. xanthocleslL. doris clade with 

581 relatively low posterior probability. Furthermore, the mean ages of nodes are nearly identical to 

582 those proposed in a supermatrix analysis, with similar 95% confidence intervals. Although the 

583 species tree is largely as predicted, we observe high levels of incongruence in the underlying 

584 distribution of gene trees (Sup. Fig. SI). Differences in the depth of coalescence are clear 

585 throughout the tree and reticulation is again especially apparent in the H. melpomene/cydno clade. 

586 The estimated population size values are also consistent with a previous comparison based on two 

587 nuclear loci, showing a higher population size of H, erato (1.33><10 6 individuals) when compared to 



Downloaded from http://biorxiv.org/ on September 18, 2014 

588 H. melpomene (1.02xl0 6 ) (Flanagan et al. 2004). 

589 While *BEAST is a powerful approach to account for ILS under a complex evolutionary 

590 model, it does not take into account various other sources of data heterogeneity. BUCKy derives a 

591 topology together with CFs, which represent the proportion of the genome supporting a particular 

592 clade. The most likely clade may therefore receive low support if it is represented in only a few of 

593 the Bayesian gene tree posteriors (Larget et al. 2010). The method is thus able to propose a robust 

594 topology in the face of ILS, hybridisation or other complex processes. As described above, we find 

595 the phylogeny derived by BUCKY to be entirely consistent with the Bayesian analysis of 

596 concatenated sequence, although the recovered CFs are much lower than any other measure of 

597 support applied to our data. Importantly, most of the nodes connecting the major clades in the tree 

598 have CFs below 0.5, with the notable exception of the silvaniform/me/pomene split (Fig. 4). The 

599 same nodes correspond to the reticulations in the NeighborNet analysis (Sup. Fig. S3), cases of low 

600 support in the MDC tree and its disagreement with the supermatrix analysis (Sup. Fig. S6), nodes 

601 that cannot be rejected in the ML tests of topologies (Table 2), and the uncertain nodes in the whole 

602 mitochondrial tree (Fig. 3). 
603 

604 Tempo of Diversification 

605 The phylogeny estimated under a relaxed clock model in BEAST shows diversification 

606 dynamics that differ from previous estimates, with the splits between the genera of Heliconiini 

607 being older and the extant species and subgenera much younger than previously thought (Table 

608 3)(Mallet et al. 2007). The most speciose genera Heliconius and Eueides separated 17 Ma and both 

609 started to diversify around 10 Ma, and the six major clades of Heliconius (corresponding to H. 

610 erato, H. sara, H. xanthocles, H. wallacei, H. melpomene/silwamforms and Neruda) all started to 

61 1 diversify between 5 and 4 Ma (Fig. 1). The lineage through time plot (LTT) for Heliconiini suggests 

612 a period of stasis corresponding to the mid-Miocene 16-11 Ma (Hoorn et al. 2010), and followed by 

613 a sudden increase in the number of extant lineages (Fig. 5a). In case of the 45 Heliconius species, a 
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614 shorter slowdown is found between 7 and 4.5 Ma (Fig. 5b) (Hoorn et al. 2010). As expected from 

615 the LTT plots, the MCCR test detects no evidence for an overall slowdown in the diversification 

616 rate (Heliconiini: gamma= 2.643, p=0. 996; Heliconius: gamma= 0.153,/>=0.617). 

617 Maximum Likelihood (ML) modelling strongly supports the Birth-Death model with rate 

618 shift as the best fit for Heliconiini (Akaike weight of 0.92; Table 4). Both DDD and turboMEDUSA 

619 models demonstrate that around 1 1 Ma the speciation rate of Heliconiini increased dramatically 

620 from 0. 19 to 0.40 new species per lineage per million years, and the shift was accompanied by an 

621 increase in the turnover rate from 0.11 to 0.65. The results for Heliconius are less clear, as equal 

622 Akaike weights (0.30) are given to the Pure Birth models without rate changes, and with an almost 

623 twofold increase in speciation rate 4.5 Ma (Table 4). TurboMEDUSA and DDD models allowing 

624 for a shift at an unspecified time also find that speciation accelerated 10-11 Ma and 4-5 Ma. 
625 
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640 Discussion 

641 

642 Stable Topology Despite Marker Incongruence 

643 We approached the problem of phylogeny reconstruction in a difficult mimetic assemblage 

644 through extensive intraspecific sampling of 22 markers from nearly all species in the clade, and 

645 compared the results between multiple philosophically distinct analytical approaches. As next 

646 generation sequencing technologies become widely accessible and the average number of loci used 

647 in systematic studies increases rapidly, Multi-Species Coalescent (MSC) methods gain in 

648 importance as means of detecting and accounting for incongruence in multilocus data (e.g. Lee et al. 

649 2012; Barker et al. 2013; Smith et al. 2013). However, their relative merits and utility at different 

650 levels remains contested (Song et al. 2012; Gatesy and Springer 2013; Reid et al. 2013). The 

65 1 systematic relations of the tribe Heliconiini, which diverged from its extant outgroup about 47 Ma, 

652 can be effectively resolved with both MSC and supermatrix approaches, yielding highly similar 

653 topologies across a range of different sub-sampling schemes that correspond to the requirements of 

654 individual algorithms (Fig. 1, 3 and 4; Sup. Fig. S2-S6). Nonetheless, consistent with the recent 

655 radiation of the group, large effective population sizes and known hybridisation between many 

656 species, we observed high heterogeneity among sampled fragments of the genome that differ 

657 markedly in both topology (Fig. 2, Sup. Fig. SI) and rates of evolution (Concaterpillar analysis). 

658 Such heterogeneity might have been expected to pose a significant challenge for the concatenation 

659 methods (Degnan and Rosenberg 2006; Edwards et al. 2007; Edwards 2009; Knowles and Kubatko 

660 2010; Leache and Rannala 2010). However, the only method producing an obviously different 

661 phylogeny is Minimising Deep Coalescences (MDC; Sup. Fig. S6), which fails to resolve 34 out of 

662 62 nodes in the tree with bootstrap support above 0.9, and is the only method to suggest monophyly 

663 of Eueides with other basal genera of Heliconiini. MDC derives a species tree from point estimates 

664 of gene trees and can be expected to perform poorly with a relatively limited number of gene trees 

665 that are not always fully resolved, leading to a complete polytomy in some of the clades (Gatesy 

666 and Springer 2013). However, the MDC result is an indicator of instability, as well-resolved and 
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667 consistent gene trees should produce a good quality MDC tree. 

668 Recent studies have proposed that likelihood-based MSC techniques perform better than and 

669 should be preferred to integration over individual gene trees, due to their potential to capture 

670 synergistic effects between partitions (Leache and Rannala 2010; Reid et al. 2013), mirroring the 

671 phenomenon of hidden support found when multiple markers are concatenated (Gatesy and Baker 

672 2005). Our results support this observation and further show that high degrees of conflict between 

673 many partitions can be reconciled by both supermatrix and MSC approaches to extract the 

674 predominant signal of speciation. The superiority of the MSC methods lies in the effective 

675 demonstration of incongruences, represented by lower support values or concordance factors 

676 assigned to the more difficult nodes (Belfiore et al. 2008). The Bayesian concordance analysis in 

677 BUCKy assigns insignificant concordance factors to most of the nodes separating major subgenera 

678 of Heliconius. Two of these nodes (H. wallacei and N. aoede; Fig. 4) are also only weakly 

679 supported by the *BEAST coalescent model, and correspond to the areas of high reticulation in the 

680 NeighborNet network, reflecting conflicting signals (Sup. Fig. S3). The same nodes are all assigned 

681 a posterior probability of one in the Bayesian supermatrix analysis (Fig. 1), potentially leading to 

682 the erroneous conclusion that all the data point unequivocally to the inferred relations. 

683 Our study highlights an important practical consideration in choosing the optimal analytical 

684 approach, where the requirements of the selected algorithm have to be reconciled with a realistic 

685 sampling of taxa. Our final goal of testing hypotheses regarding the diversification dynamics of 

686 Heliconiini can be met only if the number of included taxa is maximised. Despite a substantial 

687 effort we only managed to secure single samples of the rare or geographically restricted species, and 

688 some of them are represented solely by historical specimens with limited potential to generate 

689 extensive multilocus data. Considering that our study group is intensively studied and exceptionally 

690 well represented in research, museum and private collections due to its aesthetic appeal (Mallet et 

691 al. 2007), it would be considerably more challenging to obtain a complete sampling of many other 

692 groups. Another difficulty stems from the fact that the advanced coalescent techniques like BUCKy 
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693 and *BEAST perform best with multiple samples per species, which should capture intraspecific 

694 diversity (Heled & Drummond 2010, Reid et al. 2013). Finally, much of the uncertainty in the 

695 estimates can be attributed to missing data, which can negatively affect the estimation of both 

696 individual gene trees and the encompassing species tree (Wiens and Morrill 201 1 ; Roure et al. 

697 2013). When fitting a NeighborNet network, we found that although the percentage of data missing 

698 from the matrix does not explain all of the observed reticulation and the high delta score, it causes 

699 these parameters to increase, thus suggesting that the data completeness at each alignment position 

700 must be maximised to identify genuine incongruence. We observe that in many cases of biological 

701 interest it will be a formidable challenge to generate the ideal dataset that (i) has little missing data, 

702 (ii) comprises a large, genome-wide sample of loci, (iii) includes all taxa, and (iv) captures 

703 intraspecific variability. In case of Heliconiini, the supermatrix approach based on a limited number 

704 of markers (22) helped us to maximise taxonomic inclusiveness without compromising our ability 

705 to reconstruct a phylogeny in the light of conflicting biological signals. 
706 

707 Divergence time estimates 

708 Our phylogeny of Heliconiini brings novel insight into the diversification dynamics of the 

709 clade. Although most age estimates for Heliconius agree with other studies, the deeper nodes are 

710 older than previously suggested (Table 3). There is little agreement on the dates above the species 

71 1 level, and the studies to date either suffer from insufficient taxon sampling (Pohl et al. 2009, 

712 Wahlberg et al. 2009, Cuthill and Charleston 2012; Table 3), or use markers unlikely to be 

713 informative above a relatively low level of divergence (Mallet et al. 2007). For instance, the mean 

714 age of the split between Heliconius and Agraulis is estimated as 32 Ma in the study of Pohl et al. 

715 (2009), which includes only 3 species of Heliconiini; 26.5 Ma in Wahlberg et al. (2009), including 

716 one species per genus; or 21 Ma in the present study (Fig. 1; Table 3). Conversely, Mallet et al. 

717 (2007) find the divergences between the basal genera of Heliconiini to be much younger than we 

718 propose, likely due to an effect of using a fast-evolving mitochondrial locus without partitioning 
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719 (Brandleyetal. 2011). 

720 Our own ability to correctly falsify the hypotheses regarding the diversification of 

721 Heliconiini hinges on having a nearly complete phylogeny, yet none of our MSC analyses consider 

722 as many species of Heliconiini as the Bayesian supermatrix estimate. We are confident that the 

723 values proposed by the supermatrix method can be trusted, as both the topology and the branch 

724 lengths are consistent with the results inferred by *BEAST based on a smaller dataset (Table 3; 

725 Sup. Fig. S2). We find that the ages of the deeper nodes and the length of terminal branches are not 

726 inflated by the supermatrix method in comparison to *BEAST, contrary to the predictions from 

727 simulations (Burbrink and Pyron 201 1), and both methods infer similar mean age for the observed 

728 splits, for instance 10.5 Ma for the basal divergence of Heliconius into the H. erato and H. 

729 melpomene lineages, or 2-3 Ma for the diversification of H. melpomene and silvaniform clades. 

730 Hence we offer a new perspective on the dating of Heliconiini radiation with a nearly complete set 

73 1 of divergence time estimates based on a well-resolved and supported tree. 

732 
733 

734 Rapid Adaptive Radiation o/Heliconius 

735 Heliconius have undergone a rapid adaptive radiation sensu Schluter (2000), associated with 

736 the evolution of Miillerian mimicry, close coevolution with the Passifloraceae host plants (Benson 

737 et al. 1975) and other traits unique among the Lepidoptera (reviewed by Beltran et al. 2007). 

738 Maximum Likelihood models demonstrate that around 1 1 Ma the speciation rate of Heliconiini 

739 increased dramatically from 0.19 to 0.40 new species per lineage per million years, and the shift 

740 was accompanied by an increase in the turnover rate from 0.11 to 0.65, indicating higher likelihood 

741 of extinction of the lineages. Our results are consistent with the radiation having been stimulated by 

742 environmental factors such as mountain uplift, and creation of elevation gradients facilitating 

743 parapatric speciation (Wesselingh et al. 2009, Hoorn et al. 2010, Turchetto-Zolet et al. 2013). 

744 Consistent with the inference of the Central and Eastern Andean slopes as the "species pump" of 
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745 Heliconiini (Rosser et al. 2012), we find that diversification of Heliconius and Eueides into 

746 subgeneric clades occurred within the two million years after the major Andean uplift event 12 Ma, 

747 and that the modern species of Heliconius started to diversify after the last intensive orogenic event 

748 4.5 Ma (Gregory- Wodzicki 2000). Contrary to our expectations, mimicry does not leave a clear 

749 signature of positive feedback on the rate of diversification. Although mimetic interactions have 

750 been repeatedly invoked as drivers of speciation in Heliconiini and butterflies generally (Bates 

751 1862; McMillan et al. 1997; Mallet and Joron 1999; Elias et al. 2008; Jiggins 2008; Savage and 

752 Mullen 2009) and mimicry can possibly decrease the risk of extinction as a consequence of reduced 

753 predation rates (Vamosi 2005), model fitting does not support an exponential increase in rates of 

754 speciation. 

755 Although we find sudden increases in speciation rates, there is no indication that the rates 

756 slowed down significantly afterwards. This result is consistent with a recent critique (Day et al. 

757 2013) of the traditional prediction that large radiations should display a pattern of an initially high 

758 net diversification rate, decreasing as the ecological niches fill up (e.g. Rabosky and Lovette 2008). 

759 This expectation is reasonable for spatially limited island radiations that constitute the majority of 

760 study cases to-date, but it does not necessarily apply to continental radiations in the tropics, where 

761 the scale and complexity of the ecosystems are likely to generate a number of suitable niches 

762 greatly exceeding even the cladogenetic potential of large radiations (Day et al. 2013). So far, 

763 steady diversification of a widely distributed taxon has been demonstrated in the African Synodontis 

764 catfish (Day et al. 2013) and the Neotropical Furnariidae ovenbirds (Derryberry et al. 201 1), but the 

765 generality of this pattern remains unknown. Heliconiini constitute an interesting case of a 

766 widespread, continental radiation where at least one sharp transition to a higher species turnover 

767 punctuates an otherwise steady diversification. Thus the physical environment has acted a driver of 

768 diversification, but did not limit cladogenesis. 

769 Some uncertainty surrounds the last few million years of evolution of Heliconiini. Early 

770 speculation regarding the drivers of speciation suggested diversification in allopatry, as the 



Downloaded from http://biorxiv.org/ on September 18, 2014 

771 rainforest habitat occupied by most species in the group has undergone cycles of contraction and 

772 expansion in response to recent climatic variation (Turner 1965; Brown et al. 1974; Brown 1981; 

773 Sheppard et al. 1985; Brower 1994a). The hypothesis of vicariant cladogenesis has been 

774 subsequently criticised due to a lack of evidence for forest fragmentation in pollen core data 

775 (Colinvaux et al. 2000, Dasmahapatra et al. 2010), as well as the likelihood of parapatric speciation 

776 (Mallet et al. 1998). The decrease in observed diversification over the last million years is likely due 

777 to our limited ability to delineate species in the assemblages of highly variable taxa like Heliconius 

778 - a phenomenon that has been termed "protracted speciation" (Etienne & Haegemann 2012). We 

779 provide direct evidence that most specific divergences between species of Heliconiini occurred 

780 during the Miocene, and that both climate and orogeny strongly influenced the pattern of 

781 diversification, although we cannot directly test the causes of this association with these data. 
782 

783 Taxonomic and Systematic Implications 

784 Application of novel algorithms to the Heliconiini data reveals a relatively stable topology 

785 despite limited support. We uphold the previous re-classification of the species in the genera 

786 Neruda (Huebner 1813) and Laparus (Linnaeus 1771) as Heliconius based on their nested position 

787 (Beltran et al. 2007). This placement is at odds with morphological evidence from adult and larval 

788 characters, which puts Neruda and Laparus together with Eueides (Penz 1999), suggesting 

789 convergent evolution of homoplasious morphological characters. Nevertheless, the molecular 

790 evidence is decisive and we thus synonymize Laparus syn. nov. and Neruda syn. nov. with 

791 Heliconius. 

792 The position of the enigmatic genus Cethosia remains unresolved, as it currently depends on 

793 the chosen method of analysis, and is unlikely to be established without a broad sampling of species 

794 using multiple markers. Cethosia has been variably considered to be either the only Old World 

795 representative of Heliconiini (Brown 1981; Penz and Peggie 2003; Beltran et al. 2007), a genus of 

796 Acraeini (Penz 1999; Wahlberg et al. 2009), or possibly a distinct tribe (Muller and Beheregaray 
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797 2010). Establishing the systematic relations between Acraeini, Cethosia and Heliconiini is 

798 important for the study of Heliconiini macroevolution, as it could shed light on the origins of the 

799 heliconian affinity for the Passifloraceae host plants. Known records indicate that Cethosia species 

800 feed on Adenia and Passiflora (Miiller and Behergaray 2010), raising the possibility that 

801 Heliconiini may have evolved from an Australasian ancestor that already exhibited the modern 

802 dietary preference. 

803 Heliconiini have been at the centre of the debate about species concepts and designation 

804 criteria, providing empirical evidence for the permeability of species barriers (Mallet et al. 2007, 

805 Nadeau et al. 2012, Martin et al. 2013). At the same time, new species continue to be described 

806 based on morphological, genetic or karyotypic evidence (Lamas et al. 2004; Constantino and 

807 Salazar 2010; Dasmahapatra et al. 2010; Moreira and Mielke 2010), although the validity of the 

808 taxonomic status of some of these new lineages is disputed (J. Mallet and A. Brower, pers. comm.). 

809 Our diversification analysis is therefore prone to error resulting from the uncertainty in the number 

810 of terminal taxa that should be considered. We assume the number of species in the clade to be 78, 

811 reflecting the published taxonomy (Lamas 2004; Constantino and Salazar 2010; Rosser et al. 2012), 

812 and we deliberately exclude a formally undescribed species of Agraulis. Although we included H. 

813 tristero at the time of our analysis, it is not considered a species any longer (Merot et al. 2013). In 

814 addition, a recent study suggests that Philaethria pygmalion and Philaethria wernickei may in fact 

815 constitute a single species (Barao et al. 2014). Although the number of Heliconiini species may 

816 therefore vary between 73 and 79, changing the number of missing species in the diversification 

817 analyses does not alter the results substantially. 
818 

819 Conclusion 

820 We present a taxonomically comprehensive phylogeny of a large continental radiation 

821 characterised by extensive hybridisation, adaptive introgression and rapid speciation, and emerging 

822 as a prominent system for comparative genomics. The tribe Heliconiini has radiated in association 
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823 with environmental change in the Neotropics over the past 23 Ma and that speciation and species 

824 turnover increased dramatically at the end of the Mid-Miocene. Although there is a clear signature 

825 of processes leading to incongruence of individual gene trees and the species tree, we consistently 

826 recover the same topology using different analytical approaches. We conclude that the established 

827 supermatrix methods perform well, but are less effective in detecting the underlying conflict and 

828 estimating nodal support than the alternative Multispecies Coalescent techniques. 

829 Miillerian mimicry affects the mode of speciation in Heliconius, and probably plays a role 

830 for the other, less researched genera of Heliconiini as well. Parapatry and sympatry are likely of 

831 importance in the ecological speciation of the butterflies, due to the dual role of wing patterns as 

832 both aposematic warnings under strong selection and essential cues for mating (Jiggins et al. 2001, 

833 Merrill et al. 2012). As incipient species diverge in their patterns by co-evolving with differently 

834 patterned co-mimics from multiple sympatric mimicry rings, assortative mating can lead to further 

835 genetic isolation of the lineages, enabling further divergence (Merrill et al. 2012). Conversely, the 

836 adaptive value of locally advantageous patterns facilitates adaptive introgression of pattern loci 

837 between species (Mallet et al 2007, Heliconius Genome Consortium 2012, Martin et al. 2013). 

838 Mimicry is thus an important factor in enhancing gene flow between species of Heliconius and 

839 Eueides, likely contributing to the observed levels of discordance between sampled markers. The 

840 seemingly unusual aspects of our study system, including introgression, hybridisation and 

841 speciation in sympatry, have all been only recently recognised as important evolutionary processes. 

842 We expect that reticulate signals in phylogenetic data will be increasingly important in future 

843 systematic studies and the Heliconiini represents a case study where the knowledge of specific 

844 microevolutionary processes can inform our understanding of cladogenesis at deeper timescales. 
845 

846 
847 

848 
849 
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880 Figure 1 . Bayesian phylogeny of 71 out of 78 butterflies in the tribe Heliconiini with outgroups, 



881 estimated using 22 markers with an uncorrected molecular clock method (BEAST). The age of the 



882 root is calibrated based on the results of Wahlberg et al. (2009) and the bars signify the 95% 
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883 confidence intervals around the mean node ages. The dots indicate the Bayesian posterior 

884 probability of the splits: >99% (black), 95-99% (grey), <95% (white). Scale axis in Ma. Red 

885 triangles indicate the periods of Andean orogenesis 23, 12 and 4.5 Ma and the blue asterisk marks 

886 the end of the Mid-Miocene Global Optimum (Hoorn et al. 2010). Deep splits are shown within the 

887 well-studied Heliconius erato and H. melpomene. FG=French Guiana. Heliconiini exhibit complex 

888 patterns of divergence and convergence in aposematic wing patterns, top to bottom: Actinote latior 

889 (outgroup), Eueides tales michaeli, E. lampeto lampeto, H. telesiphe telesiphe, H. erato favorinus, 

890 H. demeter ucayalensis, H. sara, H. aoede, H. doris (blue morph), H. burneyi jamesi, H. 

891 melpomene amaryllis, H. timareta ssp., H. numata lyrcaeus, H. pardalinus dilatus. 
892 
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91 1 Figure 2. Multi-dimensional scaling ordination of pairwise Robinson-Foulds distances between 21 

912 gene trees fitted in the Tree Set Visualiser shows no clear patterns of incongruence between the 

913 markers. Phylogenies of two mitochondrial and 20 nuclear markers were estimated in MrBayes. 

914 The axes have no units. 
915 
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Figure 3. Whole mitochondrial maximum likelihood (RAxML) phylogeny of the genus Heliconius. 
Bootstrap support values indicated. Scale bar in units of substitution per site per million years. 
B=Brazil, Bo=Bolivia, C=Colombia, E=Ecuador, FG=French Guiana, P=Panama, Pe=Peru. 
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922 Figure 4. A phylogenetic hypothesis for Heliconius showing the extent of concordance between tree 

923 topologies of the 21 loci estimated by Bayesian concordance analysis in BUCKy. Dots indicate the 

924 Concordance Factor values for the nodes, with darker shades of grey corresponding to lower 

925 support values. B=Brazil, Bo=Bolivia, C=Colombia, E=Ecuador, FG=French Guiana, P=Panama, 

926 Pe=Peru. 
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928 

929 Figure 5. Lineage through time (LTT) plots based on the calibrated, uncorrelated molecular clock 

930 analysis in BEAST. A. Semilogarithmic LTT plot of the Heliconiini butterflies with the maximum 

931 clade credibility phylogeny in black and a random sample of 200 posterior trees around. B. LTT 

932 plot for Heliconius. 
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944 Table 1 . Position in the genome, variability statistics and models of sequence evolution for two mitochondrial and 20 nuclear markers used in the 

945 study. Statistics are reported for the core alignment including outgroups. 
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Laparus not in Heliconius 


-143233 


0.000 


^<0.05 


H. erato monophyletic 


-142592 


0.114 




H. ricini basal to H. demeter 


-142590 


0.084 




H. antiochus basal to H sara 


-142588 


0.170 




primitive clade monophyletic 


-142613 


0.000 


p<0.05 


H. wallacei monophyletic 


-142584 


0.115 




H. elevatus monophyletic 


-142590 


0.041 




H. cydno monophyletic 


-142619 


0.001 


p<0.05 


H. heurippa with H. timareta 


-142586 


0.209 




H melpomene East+Guiana clade 


-142620 


0.011 


/?<0.05 



949 

950 Table 2. Maximum likelihood tests of alternative topologies. Phylogenies were estimated under 

95 1 constraints enforcing alternative topologies and compared to the original ML tree with the 

952 Shimodaira-Hasegawa test and the Expected Likelihood Weights test with 100 bootstrap replicates. 

953 
954 
955 
956 
957 
958 
959 
960 
961 
962 
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Heliconius 
vs Eueides 


Heliconius vs 
Agraulis 


Heliconius vs 
Philaethria 


H. melpomene 
vs H. erato 


H erato vs 
H. hecalesia 


Mallet et al. 


11.0 


13.0 


14.5 


9.5 


4.0 


2007 












Pohl et al. 


n/a 


32.0 


n/a 


17.0 


n/a 


2009 




(40.0-24.0) 




(11.0-23.0) 




Wahlberg et 


18.5 


26.5 


30.0 


n/a 


n/a 


al. 2009 


(12.5-24.5) 


(21-32) 


(36.5-23.5) 






Cuthill and 


13.0 


n/a 


n/a 


10.5 


4.5 


Charleston 


(10.5-16.5) 






(8-13.5) 


(2.8-6.3) 


2012 












BEAST 


16.7 


21.8 


23.8 


10.2 


3.6 




(14.1-19.5) 


(18.6-25.5) 


(20.3-27.7) 


(8.7-12.0) 


(3.0-4.4) 


*BEAST 


17.5 


n/a 


n/a 


10.6 


n/a 




(11.6-23.4) 






(14.9-6.7) 





963 

964 Table 3. Mean split ages in Ma at different levels of divergence within Heliconiini, as estimated by 

965 previous studies and by two Bayesian relaxed clock methods in the present work. Mean ages and 

966 95% confidence intervals are reported where available. 

967 
968 
969 
970 
971 
972 
973 
974 
975 
976 
977 
978 
979 
980 
981 
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Model 


I 


u 






K 


K 2 


-InL 


AICc 


wAIC 


Heliconiini 


PR 


0 23 












-1 58 48 


319 01 

J 1 J . U 1 


0 00 

V7 . V 7 V 7 


JTJ_> 9 Mill I J. .J IVld 


0 17 

u. i / 


0 31 

U.J) 1 










-1 55 43 


31 5 04 

J) 1 J .U4 


0 00 

u.uu 


PR shift 4 5 Ma 


0 1 3 

U. 1J 


0 33 










-1 52 27 

± JZ, .Z, / 


308 72 

JUO. / Z, 


0 03 


PR shift 1 1 Ma 


0 1 1 


0 26 










-1 55 57 

1 J J . J / 


31 5 32 

J 1 J .Ji- 


0 00 

u.uu 


PR shift 1 2 Ma 


0 1 1 


0 25 










-1 55 82 

1JJ. Oi. 


315 81 

JlJ.Ol 


0 00 

V7 . V7 V 7 


PR anv shift 


0 12 


0 32 










-151 25 

1 J 1 .Z, J 


308 86 


0 03 




0 38 




0 26 








-1 54 62 


313 41 


0 00 

u.uu 


RD shift 3 5 Ma 

JJJ_/ ? ollllL J.J 1VJ.U. 


0 31 


0 31 


0 23 


0 00 

U. UU 






-1 53 50 


315 61 

Jl J.U1 


0 00 

U. UU 


RD shift 4 5 Ma 


0 16 


0 33 

v.JJ 


0 04 

U.Ut 


0 00 

u.uu 






-1 52 27 


313 14 

J 1 J. 1" 


0 00 

U.UU 


RT) shift T T Ma 


U.XZ7 




0 02 


0 26 






-T46 70 


^02 00 


0 Q2 


RF) shift 1 2 Ma 

_D_L/ 9 51111 L 1Z, IVld 


0 21 


0 37 

U.J) / 


0 02 

u.uz. 


0 25 






-151 85 

U 1 . 0 J 
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0 01 

U.U 1 


Rf) anv shift 

lj V_J 9 till y jini L 


0 26 


0 37 

U.J / 


0 16 


0 23 






-1 53 82 

i j j . oz 


31 8 56 


0 00 
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0 23 

U.Z, J 
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-1 58 48 
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0 00 

U. UU 


TYTYF1 

l_y l_y Lv 1 


0 1^ 

U. J J 




0 90 




1 6x1 0 5 

1 U A 1 U 






114 QO 

J 1^.7U 


0 00 

u.uu 


DDE5 


0.36 




0.24 




2190 




-153.64 


313.63 


0.00 


DDE1, shift 3.5 Ma 


0.31 


0.42 


0.23 


0.00 


7912 


208 


-153.41 


320.13 


0.00 


DDE1, shift 4.5 Ma 


0.15 


0.45 


0.04 


0.01 


Inf 


171 


-151.97 


317.25 


0.00 


DDE1, shift 11 Ma 


0.94 


0.75 


0.40 


0.38 


33 


68 


-155.09 


323.49 


0.00 


DDE1, shift 12 Ma 


0.17 


0.30 


0.00 


0.07 


26 


941 


-154.36 


322.02 


0.00 


DDE1, any shift 


0.16 


0.49 


0.01 


0.07 


71 


152 


-151.18 


318.13 


0.00 



Heliconius 



PB 



0.42 



-73.87 149.82 



0.30 
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PB, shift 3.5 Ma 


0.38 


0.44 










-73.77 


151.82 


0.11 


PB, shift 4.5 Ma 


0.26 


0.47 










-72.79 


149.86 


0.30 


BD 


0.43 




0.02 








-73.86 


152.00 


0.10 


BD, shift 3.5 Ma 


0.60 


0.44 


0.39 


0.00 






-73.26 


155.50 


0.02 


BD, shift 4.5 Ma 


0.26 


0.47 


0.00 


0.00 






-72.79 


154.56 


0.03 


DD-E 


0.42 








Inf 




-73.87 


152.01 


0.10 


DDE1 


0.45 




0.07 




6xl0 5 




-73.88 


154.34 


0.03 


DDE5 


0.43 




0.02 




Inf 




-73.86 


156.70 


0.01 


DDE1, shift 3.5 Ma 


0.39 


0.60 


0.02 


0.00 


llxlO 6 


111 


-73.51 


161.16 


0.00 


DDE1, shift 4.5 Ma 


0.26 


0.70 


0.00 


0.00 


Inf 


85 


-72.18 


158.51 


0.00 


DDE1, any shift 


0.38 


0.45 


0.01 


0.00 


128 


9710 


-73.79 


164.53 


0.00 



983 

984 Table 4. Results of diversification model fitting to the maximum clade credibility chronograms of 

985 Heliconiini and Heliconius, estimated under a relaxed molecular clock model in BEAST. Models 

986 are compared in terms of corrected AIC (AICc) and relative Akaike weights (wAIC). Models: 

987 PB=Pure Birth, BD=Birth-Death, DD-E=Diversity Dependence without Extinction, 

988 DD+El=Diversity Dependence with Extinction and a linear scaling of speciation term, 

989 DD+E5=Linear Dependence With Extinction and linear scaling terms for speciation and extinction. 

990 Where indicated, models include shifts in speciation and extinction rates at fixed times, or at any 

991 time. 
992 

993 
994 
995 
996 
997 
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Supplementary Figure SI. Gene tree estimates for well-sampled Heliconius species from the Bayesian multispecies 
coalescent analysis in *BEAST. Reticulation reaches the highest levels in the Heliconius melpomene clade. 
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Supplementary Figure S2. Species tree from the * BEAST analysis. The times of divergence were estimated using a 
relaxed molecular clock with the prior on the mean age of the root based on a dating of the Nymphalidae radiation 
(Wahlberg et al. 2009). Node bars show the 95% confidence intervals around the mean age. The scale axis indicates 
time in Ma. 
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1438 

1439 Supplementary Figure S3. NeighborNet network of Heliconiini based on pairwise DNA distances computed under the 

1440 F84 model of sequence evolution in SplitsTree v. 4. Reticulation is especially pronounced at the vertices linking genera 

1441 and subgenera of Heliconius. The scale bar is in units of average number of substitutions per site. 
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1447 Supplementary Figure S4. Maximum Likelihood (RAxML) phylogeny of Heliconiini butterflies based on two 

1448 mitochondrial and 20 nuclear genes, estimated under the GTR+G model. Support values are estimated based on 1000 

1449 bootstrap replicates. All specimens are included in this analysis. The scale bar is in units of average number of 

1450 substitutions per site. 
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Supplementary Figure S5. Maximum Likelihood (RAxML, GTR+G model) phylogeny of the genus Eueides based on 
1 1 loci sampled in most of the species. Support values based on 1000 bootstrap repliactes, scale bar is in units of 
average number of substitutions per site. 
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1486 Supplementary Figure S6. Minimise deep coalescences (MDC) phylogeny of Heliconiini: a 50% consensus of 100 

1487 bootstrap replicates of the set of 22 Bayesian gene trees. 
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